Online Imbalanced Learning with Kernels

نویسندگان

  • Haiqin Yang
  • Junjie Hu
  • Michael R. Lyu
چکیده

Imbalanced learning, or learning from imbalanced data, is a challenging problem in both academy and industry. Nowadays, the streaming imbalanced data become popular and trigger the volume, velocity, and variety issues of learning from these data. To tackle these issues, online learning algorithms are proposed to learn a linear classifier via maximizing the AUC score. However, the developed linear classifiers ignore the learning power of kernels. In this paper, we therefore propose online imbalanced learning with kernels (OILK) to exploit the non-linearity and heterogeneity embedded in the imbalanced data. Different from previously proposed work, we optimize the AUC score to learn a non-linear representation via the kernel trick. To relieve the computational and storing cost, we also investigate different buffer update policies, including first-in-first-out (FIFO) and reservoir sampling (RS), to maintain a fixed budgeted buffer on the number of support vectors. We demonstrate the properties of our proposed OILK through detailed experiments.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cost Sensitive Online Multiple Kernel Classification

Learning from data streams has been an important open research problem in the era of big data analytics. This paper investigates supervised machine learning techniques for mining data streams with application to online anomaly detection. Unlike conventional machine learning tasks, machine learning from data streams for online anomaly detection has several challenges: (i) data arriving sequentia...

متن کامل

Large-Scale Learning with Structural Kernels for Class-Imbalanced Datasets

Much of the success in machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit inherent structure present in the data under interest. Kernel methods represent a rich family of techniques that harvest on this principle. Domain-specific kernels are able to exploit rich structural information present in the input data to deliver state of ...

متن کامل

A Framework of Online Learning with Imbalanced Streaming Data

A challenge for mining large-scale streaming data overlooked by most existing studies on online learning is the skewdistribution of examples over different classes. Many previous works have considered cost-sensitive approaches in an online setting for streaming data, where fixed costs are assigned to different classes, or ad-hoc costs are adapted based on the distribution of data received so fa...

متن کامل

Balance Support Vector Machines Locally Using the Structural Similarity Kernel

A structural similarity kernel is presented in this paper for SVM learning, especially for learning with imbalanced datasets. Kernels in SVM are usually pairwise, comparing the similarity of two examples only using their feature vectors. By building a neighborhood graph (kNN graph) using the training examples, we propose to utilize the similarity of linking structures of two nodes as an additio...

متن کامل

Online Learning with Multiple Operator-valued Kernels

We consider the problem of learning a vector-valued function f in an online learning setting. The function f is assumed to lie in a reproducing Hilbert space of operator-valued kernels. We describe two online algorithms for learning f while taking into account the output structure. A first contribution is an algorithm, ONORMA, that extends the standard kernel-based online learning algorithm NOR...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013